335 research outputs found
Information Access Using Neural Networks For Diverse Domains And Sources
The ever-increasing volume of web-based documents makes it difficult to efficiently access specialized knowledge from domain-specific sources, which requires a deep understanding of the domain and substantial comprehension effort. Although natural language technologies, such as information retrieval and machine reading comprehension systems, offer rapid and accurate information access, their performance in specific domains is hindered by training on general-domain datasets. Creating domain-specific training datasets, while effective, is time-consuming, expensive, and heavily reliant on domain experts. This thesis presents a comprehensive exploration of efficient technologies for information access in specific domains, focusing on retrieval-based systems that encompass question answering and ranking.
We begin with a comprehensive introduction to information access systems. We demonstrate the structure of an information access system through a typical open-domain question-answering task, outlining its two major components, the retrieval and reader models, and the design choices for each. We focus on three points: 1) how the two components are connected; 2) the trade-offs associated with the retrieval model and the best frontier in practice; and 3) a data augmentation method that adapts a reader model, initially trained on closed-domain datasets, to answer questions effectively in the retrieval-based setting.
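The retriever-reader connection described above can be sketched with a bag-of-words retriever and a stubbed reader. This is a minimal illustration of how the two components connect, not the thesis's models; all function names are ours:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    # Retrieval component: score every passage against the question, keep top-k.
    q = Counter(question.lower().split())
    ranked = sorted(passages,
                    key=lambda p: cosine(q, Counter(p.lower().split())),
                    reverse=True)
    return ranked[:k]

def answer(question: str, passages: list[str]) -> str:
    # Reader component (stubbed): a real reader model extracts or generates
    # an answer from the retrieved passages; returning the top passage only
    # shows where the retriever's output feeds the reader's input.
    return retrieve(question, passages, k=1)[0]
```

In practice, the design choice lies in how many passages the retriever passes forward and whether the reader sees them jointly or independently.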
Subsequently, we discuss various methods for adapting the system to specific domains. Transfer learning techniques are presented, including generation as data augmentation, further pre-training, and progressive domain-clustered training. We also present a novel zero-shot re-ranking method inspired by compression-based distance. We summarize the conclusions and findings gathered from these experiments.
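A compression-based distance of the kind that inspires such zero-shot re-ranking can be sketched with a general-purpose compressor. The normalized compression distance below uses zlib as a stand-in for an ideal compressor; this is an illustration of the underlying idea, not the proposed method:

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    # Normalized compression distance: approximates the (uncomputable)
    # information distance by using real compressed lengths as a stand-in
    # for Kolmogorov complexity.
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

def rerank(query: str, docs: list[str]) -> list[str]:
    # Zero-shot re-ranking: no training needed, just order candidates
    # by compression distance to the query.
    q = query.encode()
    return sorted(docs, key=lambda d: ncd(q, d.encode()))
```

Documents that share structure with the query compress well when concatenated with it, so they receive a smaller distance and rank higher.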
Moreover, the exploration extends to retrieval-based systems beyond textual corpora. We explore a search system for an e-commerce database, in which natural language queries are combined with user preference data to retrieve relevant products. To address the challenges of noisy labels and the cold-start problem in this retrieval-based e-commerce ranking system, we enhance model training through cascaded training and adversarial sample weighting. We also investigate a search system in the math domain, characterized by the unique role of formulas and by features distinct from textual search. We tackle math-related search problems by combining neural ranking models with structurally optimized algorithms.
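Adversarial sample weighting is described only at a high level here. One plausible reading, down-weighting examples whose loss is suspiciously high and therefore more likely to carry a noisy label, can be sketched as follows (a hypothetical scheme for illustration, not the authors' exact method):

```python
import math

def adversarial_sample_weights(losses: list[float],
                               temperature: float = 1.0) -> list[float]:
    # Hypothetical weighting: examples with unusually high loss are more
    # likely to be mislabeled, so they receive exponentially smaller weight.
    # The temperature controls how sharply high-loss samples are suppressed;
    # weights are normalized to sum to 1.
    raw = [math.exp(-loss / temperature) for loss in losses]
    total = sum(raw)
    return [w / total for w in raw]
```

Such weights would multiply each example's loss term during training, so clean samples dominate the gradient.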
Finally, we summarize the research findings and outline future research directions.
An Analysis of the Profitability of Commercial Banks in China
The profitability of the Chinese banking industry is affected by many determinants, including variables within each bank and several important macroeconomic variables that affect the performance of commercial banks. Because these effects on bank profitability may be related to government policies, they deserve serious attention. Based on annual data from 124 commercial banks in China over 2013-2018, we selected nine variables: return on average assets (ROAA), capital adequacy ratio (EQTA), insolvency risk (Z-score), bank size (TA), liquidity (NLTA), asset quality (LLRGL), cost efficiency (CTI), inflation rate (INF), and GDP growth rate (GDPGR). Using multicollinearity tests, endogeneity tests, and a GMM model to analyze the profitability of the Chinese banking industry, we conclude that the return on average assets of Chinese banks is positively correlated with the GDP growth rate and insolvency risk, and negatively correlated with bank size and cost efficiency. We offer suggestions for increasing bank profitability from the perspectives of optimizing the Chinese banking business model, shaping government policy toward commercial banks, and establishing a comprehensive financial supervision system.
Ultra-compact silicon nitride grating coupler for microscopy systems
Grating couplers have been widely used for coupling light between photonic chips and optical fibers. For various quantum-optics and bio-optics experiments, on the other hand, there is a need for good light coupling between photonic chips and microscopy systems. Here, we propose an ultra-compact silicon nitride (SiN) grating coupler optimized for coupling light from a waveguide to a microscopy system. The grating coupler is about 4 × 2 μm² in size, and a 1 dB bandwidth of 116 nm can be achieved theoretically. An optimized fabrication process was developed to realize suspended SiN waveguides integrated with these couplers on top of a highly reflective bottom mirror. Experimental results show that up to 53% (2.76 dB loss) of the power of the TE mode can be coupled from a suspended SiN waveguide to a microscopy system with a numerical aperture (NA) of 0.65. Simulations show this efficiency can increase up to 75% (1.25 dB loss) for NA = 0.95.
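The quoted losses are simply the decibel form of the stated power coupling efficiencies, via loss(dB) = -10 log10(η). A quick check of that conversion (the function name is ours):

```python
import math

def coupling_loss_db(efficiency: float) -> float:
    # Insertion loss in dB corresponding to a power coupling efficiency
    # (a fraction between 0 and 1).
    return -10 * math.log10(efficiency)
```

With the reported efficiencies, `coupling_loss_db(0.53)` rounds to 2.76 dB and `coupling_loss_db(0.75)` rounds to 1.25 dB, matching the figures above.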
Segatron: Segment-Aware Transformer for Language Modeling and Understanding
Transformers are powerful for sequence modeling. Nearly all state-of-the-art
language models and pre-trained language models are based on the Transformer
architecture. However, the Transformer distinguishes sequential tokens only with the token
position index. We hypothesize that better contextual representations can be
generated from the Transformer with richer positional information. To verify
this, we propose a segment-aware Transformer (Segatron), by replacing the
original token position encoding with a combined position encoding of
paragraph, sentence, and token. We first introduce the segment-aware mechanism
to Transformer-XL, which is a popular Transformer-based language model with
memory extension and relative position encoding. We find that our method can
further improve the Transformer-XL base model and large model, achieving 17.1
perplexity on the WikiText-103 dataset. We further investigate the pre-training
masked language modeling task with Segatron. Experimental results show that
BERT pre-trained with Segatron (SegaBERT) can outperform BERT with vanilla
Transformer on various NLP tasks, and outperforms RoBERTa on zero-shot sentence
representation learning. Comment: Accepted by AAAI 202
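The combined position encoding described above can be sketched as the sum of three embedding lookups, one per granularity. This is an illustrative reconstruction with arbitrary dimensions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

class SegmentAwarePositionEncoding:
    """Sketch of a Segatron-style combined position encoding: the single
    token-index embedding is replaced by the sum of paragraph-, sentence-,
    and token-level position embeddings (dimensions are illustrative)."""

    def __init__(self, d_model=16, max_para=8, max_sent=16, max_tok=128):
        # One learnable table per granularity; random init stands in for training.
        self.para = rng.normal(size=(max_para, d_model))
        self.sent = rng.normal(size=(max_sent, d_model))
        self.tok = rng.normal(size=(max_tok, d_model))

    def __call__(self, para_idx, sent_idx, tok_idx):
        # Each index array has shape (seq_len,); output is (seq_len, d_model).
        return self.para[para_idx] + self.sent[sent_idx] + self.tok[tok_idx]
```

Two tokens at the same token position but in different sentences now receive different encodings, which is the extra signal the hypothesis relies on.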
Approximating Human-Like Few-shot Learning with GPT-based Compression
In this work, we conceptualize the learning process as information
compression. We seek to equip generative pre-trained models with human-like
learning capabilities that enable data compression during inference. We present
a novel approach that utilizes the Generative Pre-trained Transformer (GPT) to
approximate Kolmogorov complexity, with the aim of estimating the optimal
Information Distance for few-shot learning. We first propose using GPT as a
prior for lossless text compression, achieving a noteworthy compression ratio.
An experiment with the LLAMA2-7B backbone achieves a compression ratio of 15.5 on
enwik9. We justify the pre-training objective of GPT models by demonstrating
its equivalence to the compression length, and, consequently, its ability to
approximate the information distance for texts. Leveraging the approximated
information distance, our method allows the direct application of GPT models in
quantitative text similarity measurements. Experimental results show that our
method overall achieves superior performance compared to embedding and prompt
baselines on challenging NLP tasks, including semantic similarity, zero- and
one-shot text classification, and zero-shot text ranking.
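The core idea above, that a model's negative log-likelihood is a code length and that conditional code lengths approximate the information distance, can be illustrated with a toy character-level model standing in for GPT. The unigram model and all function names are our simplification, not the paper's method:

```python
import math
from collections import Counter

def code_length_bits(text: str, model_counts: Counter,
                     alphabet_size: int = 256) -> float:
    # Shannon code length under a probabilistic model: -log2 of the
    # probability the model assigns. A unigram character model with
    # Laplace smoothing stands in for GPT's next-token distribution.
    total = sum(model_counts.values())
    bits = 0.0
    for ch in text:
        p = (model_counts[ch] + 1) / (total + alphabet_size)
        bits += -math.log2(p)
    return bits

def information_distance(x: str, y: str) -> float:
    # Approximate the information distance max(K(x|y), K(y|x)) with
    # conditional code lengths: C(y|x) is y's code length under a model
    # fitted on x, and symmetrically for C(x|y).
    cx_given_y = code_length_bits(x, Counter(y))
    cy_given_x = code_length_bits(y, Counter(x))
    return max(cx_given_y, cy_given_x)
```

Replacing the unigram model with a GPT conditioned on the other text recovers the paper's setting: texts the model predicts well from each other are "close".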